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Entanglement is the key quantum resource for improving 
measurement sensitivity beyond classical limits. However, 
the production of entanglement in mesoscopic atomic sys¬ 
tems has been limited to squeezed states, described by Gaus¬ 
sian statistics. Here we report on the creation and charac¬ 
terization of non-Gaussian many-body entangled states. We 
develop a general method to extract the Fisher information, 
which reveals that the quantum dynamics of a classically 
unstable system creates quantum states that are not spin 
squeezed but nevertheless entangled. The extracted Fisher 
information quantifies metrologically useful entanglement 
which we confirm by Bayesian phase estimation with sub 
shot-noise sensitivity. These methods are scalable to large 
particle numbers and applicable directly to other quantum 
systems. 

Multiparticle entangled states are the key ingredients for ad¬ 
vanced quantum technologies [1]. Various types have been 
achieved in experimental settings ranging from ion traps [2], 
photonic systems [3] and solid state circuits [4] to Bose-Einstein 
condensates. For the latter, squeezed states [5, 6] have been 
generated [7-12] and a rich class of entangled non-Gaussian 
states is predicted to be obtainable [13] including maximally 
entangled Schrodinger cat states [14, 15]. The production of 
these fragile states in large systems remains a challenge and 
efficient methods for characterization are necessary because full 
state reconstruction becomes intractable. Here, we generate a 
class of non-Gaussian many-particle entangled states and reveal 
their quantum properties by studying the distinguishability of 
experimental probability distributions. 

A measure of the distinguishability with respect to small 
phase changes of the state is provided by the Fisher informa¬ 
tion F [16]. It is related to the highest attainable interfero¬ 
metric phase sensitivity by the Cramer-Rao bound A6 >cr = 
1/VF [17]. This limit follows from general statistical argu¬ 
ments for a measurement device with fluctuating output [18]. 
The Fisher information is limited by quantum fluctuations of the 
input state as well as the performance of the device. Even in the 
absence of technical noise, the Fisher information of a classical 
input state is F < N because of the intrinsic granularity of 
N independent particles which translates into the shot-noise 
limit AO >1/ s/N for phase estimation. This classical bound 
can be surpassed with a reduction of the input fluctuations by 
introducing entanglement between the N particles [5]. These 
states, known as squeezed states, are fully characterized by mean 
and variance of the observable and already employed in preci¬ 
sion measurements [19-21]. In contrast, non-Gaussian quantum 
states can have increased fluctuations of the observable but nev¬ 
ertheless allow surpassing shot-noise limited performance. A 
textbook example is the Schrodinger cat state characterized by 
macroscopic fluctuations but achieving the best interferomet¬ 
ric performance allowed by quantum mechanics, i.e. at the 
fundamental Heisenberg limit F = N‘^ [22]. In general, the 


class of states that are entangled and useful for sub shot-noise 
phase estimation is identified by the Fisher information criterion 
F > A [13]. Exploiting these resources requires probabilistic 
methods for phase estimation such as maximum likelihood or 
Bayesian analysis [23] which go beyond standard evaluation of 
averages. 




FIG. 1: Preparation and detection of non-Gaussian entangled 
states. (A) Array of Bose-Einstein condensates in an optical lattice 
potential addressed by microwave and radio frequency fields. (B) The 
interplay of nonlinear interaction (blue) and weak Rabi coupling (red) 
between the internal states \a) and \b) results in an unstable fixed point 
in the classical phase space. The state of the system is visualized on 
a generalized Bloch sphere with radius J = A/2. Gray lines indicate 
trajectories of the mean-field equations of motion [18]. The initial co¬ 
herent spin state (green) ideally evolves into a squeezed state (orange) 
followed by non-Gaussian states at later evolution times (violet). Edges 
of shaded areas are contours of the Husimi distribution for N = 380 at 
1/e^ of its maximum. (C) Experimental absorption picture, showing 
the site- and state-resolved optical lattice after a Stern-Gerlach separa¬ 
tion. Shaded boxes indicate the sites with a total atom number in the 
range 380 ±15, which are selected for further analysis. (D) Example 
histograms of the imbalance = 2F/A after nonlinear evolution of 
25 ms and final rotation (angles indicated in the panels) compared with 
the ideal coherent spin state of identical N (green Gaussian). 
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FIG. 2: Entangled state characterization. (A) Tomographic reconstruction of the experimental state after evolution times of 15 ms and 25 ms. 
The Husimi projections (scaled amplitude is brightness coded [18]) confirm the creation of an elongated state and subsequent distortion of the 
Gaussian shape. (B) Variance analysis of the particle number difference reveals maximum spin squeezing of —4.5 zb 0.2 dB for 15 ms and 
—0.2 zb 0.3 dB for 25 ms. (C) Comparison of the normalized Fisher information F/N (red diamonds) for N = 380 zb 15 atoms and the inverted 
spin squeezing factor l/<f^ (blue circles). The gray shaded area is only accessible for non-separable (entangled) states. For 26 ms spin squeezing 
cannot identify the entanglement which is detected by the Fisher information. Lines are sinusoidal fits, error bars of the Fisher information represent 
the 68% confidence interval of the Hellinger distance method. 


A paradigm physical process that exhibits the transition from 
Gaussian to non-Gaussian states is the time evolution of a quan¬ 
tum state initially prepared at an unstable classical fixed point. 
Our experimental system is an array of interacting binary Bose- 
Einstein condensates of ^^Rb with additional linear coupling 
of the two internal states (see Fig. lA-C), which allows for 
the controlled realization of such unstable fixed point dynam¬ 
ics [24]. The linear coupling, realized with microwave and 
radio frequency magnetic fields, also permits precise rotations 
for initial state preparation and final state manipulation before 
read-out (see Fig. 1B,D). For a coherent state initially centered 
on the unstable fixed point, the quantum dynamics leads to spin 
squeezed states for short evolution times that subsequently trans¬ 
form into non-Gaussian states on an experimentally feasible 
time scale [14] (see Fig. IB). 

For the characterization of non-Gaussian states, higher mo¬ 
ments, or even the full probability distributions, have to be 
accessed experimentally. For this, the setup [8] has been ex¬ 
tended to realize up to 35 individual condensates in a single 
experiment, which permits the acquisition of sufficient statistics 
in a narrow window of ±15 for the final atom number. The 
populations of the atomic states \a) = \F = = 1) and 

\b) = |F = 2, mi? = — 1) are destructively detected for each 
individual condensate by state-selective absorption imaging with 
high spatial resolution (Fig. 1C) [25]. By repeating the exper¬ 
iment (typically many thousands of times), we measure the 
experimental probability distributions of the population imbal¬ 
ance 2 : = {Nb — Na)/N along defined directions by applying 
the corresponding spin rotation before detection. The analysis 
window for ± is adjusted according to the indepen¬ 

dently determined time scale of atom loss [18] to follow the time 
evolution starting with (N) = 470. Figure ID shows examples 


of observed distributions for two different orientations and an 
evolution time of 25 ms; the distributions are consistent with the 
theoretically expected structure of the state (see Fig. IB). 

Detailed insight can be gained by repeating this measurement 
for various angles (here in steps of 10 degrees) allowing for the 
maximum-likelihood reconstruction of the density matrix in the 
symmetric subspace [18]. Figure 2A shows the tomography re¬ 
sults obtained from 32,500 experimental realizations confirming 
qualitatively the expected behavior - at short evolution times the 
state has a squeezed shape whereas at later times the character¬ 
istic bending dynamics appears as expected from the presence 
of the two stable fixed points above and below the equator. 

Analyzing the variance of z for the same data as a function 
of the tomography angle (Fig. 2B) shows that the time evo¬ 
lution leads to suppressed fluctuations at 15 ms. Extracting 
the spin squeezing parameter [18] we find the minimum 
^min ~ —4.5 ± 0.2 dB below the standard quantum limit which 
demonstrates entanglement [5]. We note that, for all results 
reported here, the photon shot-noise of the absorption imaging 
of ±4 atoms is not subtracted. For longer time evolution the 
bending dynamics leads to increased fluctuations in all direc¬ 
tions i.e. tomography angles. After 25 ms spin squeezing is 
lost and we find = —0.2±0.3dB. However as shown 
in Fig. 2C, experimental extraction of the Fisher information 
(detailed below) reveals that useful entanglement is still present 
although spin squeezing is vanishing, i.e. F/N > 1/^^ [13]. 
At 26 ms, spin squeezing is completely lost whereas the Fisher 
information still indicates the presence of quantum resources 
(F/N > 1). In the Gaussian regime up to 23 ms, we observe 
that Fisher information and the inverse spin squeezing agree as 
expected F/N ^ l/^‘^ sls these states are fully characterized by 
their variance. 
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FIG. 3: Experimental extraction of the Fisher information. (A) 

Examples of experimental histograms of z (left panel) after 26 ms 
of evolution time obtained for the tomography angle a = 58° and 
additional small rotation angles 0 about the y-axis (see inset of C). The 
blue histogram is used as a reference for the following analysis. The 
right panel shows the square-root-differences between the respective 
histogram and the reference (see panel B). Gray bars indicate the 
vertical scales (see axes in panel B). (B) Procedure for extraction of 
the squared Hellinger distance (example of ^ = 1.5°). Probability 
amplitudes are subtracted for each bin and these differences 
squared and summed to obtain the squared Hellinger distance. (C) 
Squared Hellinger distances for three different evolution times. A 
reference measurement (black) with the initial coherent spin state (0 ms 
evolution) lies slightly outside the non-classical region (gray shaded 
area). The spin squeezed state (gray, 15 ms evolution) surpasses the 
classical limit. After 26 ms (red points, histograms shown in A) the 
state has a non-Gaussian shape, is not spin squeezed but still performs 
beyond the standard quantum limit. Error bars indicate the statistical 
68% confidence interval obtained by a resampling procedure [18]. The 
curvature of the quadratic fit is proportional io F/N. 

Our method for extraction of the Fisher information circum¬ 
vents the experimentally intractable full reconstruction of the 
density matrix and is based on a specific set of experimental 
probability distributions Pz{0) after small rotations 0 of the 
quantum state. As an example, in Fig. 3A we show distributions 
for the state created at 26 ms and the optimal tomography angle 
of 58 degrees (see Fig. 2C) after small rotations about the y axis 
(see Fig. 3C inset), which feature a pronounced peak and long 
tails characteristic for the bent state. The main effect of the 
small rotation is a continuous shift of the distribution towards 
increasing imbalance for larger angles 0. 

The analysis for extraction of the Fisher information builds 
on the statistical distance [26, 27] of these distribution func¬ 
tions. We use a Euclidean distance in the space of probability 
amplitudes ^/P^ known as Hellinger distance [18], defined as 

z 

According to the definition, we take the square root of the exper¬ 
imental probability distributions at finite 0 and 6> = 0 and cal¬ 
culate the difference for each bin (red histograms in Fig. 3A,B). 


Summing the squares of the differences provides the squared 
Hellinger distance d^{0). Figure 3C shows d^{0) as a function 
of 0 for three different evolution times and the respective opti¬ 
mal tomography angles. A resampling method [18] is used to 
extract error bars and reduce the statistical bias. The observed 
quadratic behavior is expected from the Taylor expansion 

dl{9) = ^e‘^ + 0{9^), ( 2 ) 

which reveals the close connection between Hellinger distance 
and Fisher information; this relationship is used to extract F 
from the curvature oi d'^{0). The gray shaded area in Fig. 3C 
indicates the region that is not accessible to separable states; for 
them, the Fisher information is limited io F/N < 1, resulting in 
a curvature smaller than A^/8. For the initially prepared state 
we find a Fisher information ofF/A/' = 0.91±0.04< 1, which 
is expected for a separable state. For a subsequent evolution 
of 26 ms the measured Hellinger distances lie in the regime of 
non-separability. This reveals entanglement in a regime where 
no spin squeezing is present. For the intermediate evolution 
time of 15 ms we extract a Fisher information F/A^ = 2.2±0.2, 
which confirms entanglement in the Gaussian spin squeezed 
state. For obtaining the systematic study of Fig. 2C, this proce¬ 
dure is performed at a given evolution time for different tomog¬ 
raphy angles. The reported values for the Fisher information 
are limited by experimental imperfections, detection noise and 
atom loss which especially affect the fragile non-Gaussian states. 
For the ideal time evolution, monotonically increasing Fisher 
information is expected, whereas the available spin squeezing 
is limited to 1/^^ ^ 18 (-12.6 dB). The ideal theoretical model 
prediction F/N ^ 90 (-19.5 dB) for the evolution time when 
spin squeezing vanishes [18]. 

There is a direct connection between Fisher information and 
sensitivity in parameter estimation. In an interferometric con¬ 
text, high sensitivity, indicated by a large value of F, means 
fast change of the output distribution with the phase 0, i.e. high 
statistical speed Qdn /dO = \/F/8 with respect to the parame¬ 
ter change. For the quantum state at 25 ms the enhanced Fisher 
information reveals quantum resources beyond the standard 
quantum limit in a range of tomography angles. No squeezing is 
detected for the optimum at 58 degrees, which implies that these 
resources can only be exploited with the knowledge of more 
details of the distribution functions. In Fig. 4A we show explic¬ 
itly through mean and variance analysis that averaging of the 
observable z does not surpass shot-noise limited performance 
for rotations about the y axis (corresponding for example to a 
phase shift inside a Ramsey interferometer [18]). 

However, the resource can be harnessed with model inde¬ 
pendent Bayesian estimation using the experimental probability 
distributions Fz{0). For this, we use an independent data set 
taken with the setting (a = 58°, 6>o = 0°) and 25 ms of evolu¬ 
tion time, which we divide into sequences {zi ,..., each 
containing m realizations. To obtain realistic measurement con¬ 
ditions, we discard the previous knowledge on the true value of 
the phase Oq (Bayesian estimation with flat prior [18]). For each 
distribution Fz{0i) we calculate the likelihood 

m 

C{9i) = \lP,.{9i), (3) 

which corresponds to the conditional probability to obtain 
the sequence {zi^... ^Zm} if the phase setting had been Oi. 
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Interferometry phase, 0 (deg) 



FIG. 4: Quantum enhanced phase sensitivity in the absence of 
spin squeezing. (A) A rotation on the Bloch sphere by the angle 0 
(right inset) is formally equivalent to the action of a Ramsey interfer¬ 
ometer with relative phase shift 0 (see Fig. S5). Interference fringe of 
z (blue line with 68% confidence interval) with the state after 25 ms 
evolution and subsequent tomography rotation of 61 degrees; reference 
measurement with a coherent spin state (black line with gray confi¬ 
dence region). The phase sensitivity AO deduced from standard error 
propagation via the slope of the interferometer fringe (left inset) does 
not surpass the standard quantum limit. (B) Bayesian analysis with the 
tomography angle for maximal Fisher information (58 degrees). For 
each sequence of length m a quadratic fit to log C is used to extract the 
Gaussian variance of C which corresponds to the phase sensitivity. 
1 / (Nma^) as a function of m shows fast convergence to the extracted 
F/N using the Hellinger distance method (red line with 68% confi¬ 
dence region). The error bars indicate the standard deviation of a single 
sequence. Gray shaded regions are only accessible with entanglement 
in the system. 


For sufficiently large m we expect a Gaussian distribution 
C{0) oc exp(—(6> — 6>c)^/2cr^) centered at Oc with the Bayesian 
phase uncertainty a [18]. Thus, a can be extracted from a 
quadratic fit to log C{0) (see Fig. 4B inset). We find a fast con¬ 
vergence of to the expected value 1 / mF (Cramer-Rao bound 
for m measurements) already for m > 2 (Fig. 4B). This ex¬ 
plicitly shows phase uncertainty below the standard quantum 
limit in agreement with the Fisher information obtained from 
the Hellinger distance method described above. 

In conclusion, we have developed a method based on the 
statistical distance of experimental probability distributions to 
extract the Fisher information, which was previously unattain¬ 


able in systems of large particle number. We demonstrate this 
on a novel class of collective states in binary Bose-Einstein 
condensates generated in the vicinity of a classically unstable 
point. The experimental value of the Fisher information serves 
to verify entanglement in the absence of spin squeezing and 
quantifies the quantum resource for improved phase estimation. 
We confirm this by characterizing the sensitivity of a Ramsey in¬ 
terferometer and find enhanced performance in agreement with 
the extracted Fisher information. The presented method does not 
depend on the special shape of the probability distributions and 
is not limited to small particle numbers. It is therefore broadly 
applicable to the efficient characterization of highly entangled 
states, relevant for further improvement of atom interferome¬ 
ters [8, 11, 12, 19, 20, 28-30] toward the ultimate Heisenberg 
limit [22]. More generally, it can be applied to any phenomenon 
characterizable by the distinguishability of quantum states, as in 
quantum phase transitions [31], quantum Zeno dynamics [32] 
and quantum information protocols [1]. 
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SUPPLEMENTARY MATERIALS 
Experimental system 

A Bose-Einstein condensate of ^^Rb atoms is loaded into a 
one-dimensional optical lattice superimposed with a shallow 
harmonic trap. The trap frequencies are 660 Hz in lattice direc¬ 
tion and 260 Hz in radial direction. In this tight confinement 
regime, the spin healing length is on the order of the size of 
the on-site wavefunction such that the single-mode approxi¬ 
mation is applicable. The large lattice spacing (5.5 pm) and 
the high inter-well potential barrier ensure that tunneling is 
negligible on the experimental timescale. The condensate is 
distributed over up to 35 independent lattice sites with occupa¬ 
tion numbers ranging from 100 to 600 atoms. In the reported 
experiments we address the two states \a) = \F = l^mp = 1) 
and \b) = \F = 2, mi? = —l)of the ground state hyperfine 
manifold by application of microwave and phase controlled ra¬ 
dio frequency radiation, which drive a magnetic two-photon 
transition. Nonlinear interaction between the condensate atoms 
is enhanced at a magnetic field of 9.12 G in the vicinity of an 
inter-species Feshbach resonance. An active stabilization, in¬ 
cluding a feed-forward of the 50 Hz mains frequency, reduces 
the shot-to-shot fluctuations of the offset field below 30 pG. 

The main physical limitation of the experimental system is 
the effect of particle losses, which leads to noise contributions as 
well as a mean change in the interaction dependent parameters 
(detailed below) during the evolution time. The most limiting 
loss channel is dipole relaxation of the F = 2 manifold, by 
which in every event two atoms of \b) get lost from the trap [33]. 
The corresponding 1/e decay time of pure |2, — 1) is ^ 200 ms. 
Additionally, the enhancement of inelastic scattering and three- 
body recombination caused by the closeby Feshbach resonance 
leads to loss of \a) and \b), which is symmetric on average. The 
combined loss leads to a decay time of ^ 110 ms for the total 
number of atoms. 

After the experimental sequence, a resonant 7r-pulse transfers 
the population of |6) to |F = l^mp = —1) to stop further 
dipole relaxation loss in F = 2. This allows for the controlled 
ramp-down of the magnetic field to ~ 1 G for absorption imag¬ 
ing with a precision of ±4 atoms for the individual atom num¬ 
bers Na and on each lattice site [25]. A short time of flight of 
1.2 ms after Stern-Gerlach separation reduces the optical density 
to achieve optimal conditions for imaging. 

Theoretical description 

Our system of N two-level atoms in each individual con¬ 
densate is conveniently described by the collective spin oper¬ 
ator J = with components Jx = (6^a + a^6)/2, 

Jy = (6’f'a — a^h)/2\ and Jz = (^^5 — o)a)/2, where is the 
Pauli vector of the ith particle and and 6^ are the respec¬ 
tive creation operators associated with the two modes. Jz = 
(TV^ — 7Vo)/2 is half the occupation number difference in the 
two modes while Jx and Jy are the corresponding coherences. 
The quantum dynamics of the N particles on each lattice site is 
described by the two-mode Josephson Hamiltonian 

H = x{N)Jl-nj^ + 5{N)J,, (SI) 

which is a special case of the more general Lipkin-Meshkov- 
Glick Hamiltonian. The parameters x, Q and S are the nonlin¬ 
earity due to atom-atom interaction, the linear coupling strength 
and the detuning, respectively. 
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FIG. S1: Schematic illustration of the Hamiltonian. The two main 
contributions are linear Rabi coupling with Rabi frequency Q (red) and 
nonlinear interaction (one-axis twisting) with strength x (blue). The 
effect of each part is visualized by several mean-field trajectories for the 
case of X = 0 and Q = 0 respectively. Arrows indicate the direction of 
movement. In the regime of A = Nx/^ > 1 the combination leads 
to a mean field phase space with three stable and one unstable fixed 
point (big sphere with A = 1.5). A separatrix (violet line) divides the 
phase space into three regions with macroscopically different temporal 
behavior. 

In the ideal case (constant parameters with Nx > ^ and 
J = 0) the early dynamics yield monotonically increasing Fisher 
information for an initial coherent spin state centered on the clas¬ 
sically unstable fixed point (eigenstate of J^^). In contrast, the 
maximally attainable spin squeezing is limited due to the bend¬ 
ing dynamics, which leads to an increase of the quantum uncer¬ 
tainty in all spin directions and a reduction of the spin squeezing 
for later evolution times. Fig. S2 shows numerical simulations 
of this ideal situation with typical experimental parameters of 
Nx/^ = 1.5, = 27r X 20 Hz and N = 430, where both 

Fisher information and spin squeezing have been optimized over 
rotation and measurement axis for each evolution time. This 
shows that the evolution creates non-Gaussian states useful for 
quantum metrology going beyond the quantum resources of the 
spin squeezed states accessible with this system. The Fisher 
information is found to saturate the Quantum Fisher Information 
for all evolution times, which is the maximum over all positive 
operator valued measures (POVM). This shows that the atomic 
imbalance stays the best observable. The experimentally ob¬ 
served decrease of the Fisher information as a function of time 
is a consequence of detection noise, technical imperfections 
and losses (detailed below). This basic behaviour is captured 
by including our detection noise of ±4 atoms as a Gaussian 
convolution of the theoretical probability distributions (Fig. S2, 
solid lines). The experimentally extracted Fisher information 
provides a lower bound for the quantum resources as it also 
includes all technical limitations. Since the quantum resources 
of the non-Gaussian states increasingly manifest themselves in 
substructures on small scales, they are especially affected by 
imperfect detection, which cannot resolve these features. 

In the experimental system 6 and x are both a function of 
the atom number [14], caused by the dependence of the shape 
of the BEG mean-field order parameter and the mean atom 
density on N. We find experimentally in good approximation 
S{N) ^ 27r X — 6]sf^/N^ with (5o = 16.3Hz and 6n = 

0.68 Hz/Vatom. The nonlinearity x(^) is oc l/\fN for our 
range of atom numbers and xo ^ 27r x 0.064 Hz for an initial 



Evolution time (ms) 


FIG. S2: Time evolution of Fisher information and squeezing The 

dashed lines show the ideal model predictions for constant parameters 
(A = 1.5, Q = 27r X 20 Hz, N = 430 and (5 = 0) obtained by exact 
diagonalization. Both Fisher information (red) and spin squeezing 
(blue) are optimized over tomography angle and readout axis. For short 
evolution times, Fisher information and the corresponding value 1 /<f ^ 
agree. The bending dynamics leads to a decrease of squeezing for 
later evolution times whereas the Fisher information monotonically 
increases. Taking into account our finite detection noise (±4) for 
the atom number (solid lines) reveals that detection noise is the main 
limitation for harnessing highly entangled states in an otherwise perfect 
system. The lower straight line corresponds to the standard quantum 
limit, the upper straight line is the value F — for a maximally 
entangled state. The inset shows a zoom into the region indicated by 
the grey box but scaled linearly. 


atom number of A'o = 470. Besides noise effects, atom loss 
during the time evolution thus leads to a time dependence of 
these parameters. 

Due to these relations, the shape of the final state strongly 
depends on atom number, as depicted in Fig. S3. The Husimi 
distributions for the final atom numbers N = 320, 380 and 440, 
obtained from the experimental tomographic reconstruction af¬ 
ter an evolution time of 25 ms, reveal a pronounced change in 
shape of the state with N. This is consistent with a numeri¬ 
cal simulation (Fig. S3 middle panel) taking into account the 
nonlinearity x = Xo\/^o/^(^) and detuning of 5{N{t)). This 
includes the atom number dependence of the parameters during 
the time evolution, assuming N{t) = with the experi¬ 

mentally determined decay time r = 110 ms due to atom loss. 
All fast pulses (with —ft{cos (l)Jx + sin^J^y) in the Hamilto¬ 
nian), including the spin-echo pulse in the middle of the time 
evolution, are modeled in the presence of nonlinearity and with a 
phase offset of = 3°, which was employed in the experiment 
to compensate for nonlinearity during the initial preparation 
pulse. The experimentally observed shape is well explained 
by the assumed model. It is mainly the finite detuning which 
leads to the asymmetry of the final state. This is reflected in the 
Husimi projections and also shows up as an asymmetry of the 
experimental probability distributions for the tomography angle 
Q/ = 58° yielding maximal Fisher information (right column of 
Fig. S3 for an evolution time of 26 ms). For a comparison with 
the model, detection noise of the absorption imaging is taken 
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into account by a convolution of the final distributions with a 
Gaussian of width adet = 6 atoms for Nfj — Na (grey curves). 
Furthermore, the effect of particle loss of ^ 100 atoms up to this 
evolution time leads to additional fluctuations which we estimate 
as (Jioss ~ 10, yielding a total width of atot ~ 12 atoms. With 
this convolution, the numerical simulations (red curves) are con¬ 
sistent with the observed experimental probability distributions. 


Experiment 25 ms Theory 26 ms, optimal Fisher information 



FIG. S3: Atom number dependence of the Hamiltonian. Experi¬ 
mental Husimi distributions of tomographically reconstructed density 
matrices (left panel) for N = 320, 380 and 440 after 25 ms of evolution 
time reveal the strong atom number dependence of the shape of the 
final state (a change of 60 atoms corresponds to a change of^ 1 Hz in 
detuning for the final atom number). The experimental distributions 
are in good agreement with numerical simulations (middle panel) in¬ 
cluding the atom number dependence of the parameters, which also 
dynamically change due to atom loss. Finite detuning during state gen¬ 
eration also shows up in the experimental probability distributions for 
a = 58° (maximal Fisher information) as depicted in the right panel 
for an evolution time of 26 ms. For the corresponding result from the 
theoretical model detection noise is included as a Gaussian convolution 
with cr = 6 atoms (grey curves). Assuming additional Gaussian noise 
due to atom loss (total a = 12, see text), the distributions are in very 
good agreement with the numerical simulations (red curves). 


Classical phase space 

A useful insight to the dynamics is offered by the mean field 
picture which is exactly valid for ^ oo and obtained by re¬ 
placing the quantum mechanical operators by their mean values 

l{Jx),{Jy),{Jz)) = 

where the imbalance z = (A^^ — A^a)/A^ is the normalized pop¬ 
ulation difference and 0 is the relative phase between the two 
internal states. In this limit, the dynamical evolution can be 
gathered in classical equations of motion of z and its conjugate 
variable (p [24, 34]. The Hamiltonian becomes 

H = ^ (^2^ - \/l- z'^cos(j)+ , (S2) 


which is formally equivalent to a non-rigid pendulum, where 
A = A^x/0. The equipotential lines of this classical Hamilto¬ 
nian are the classical trajectories. These are shown in Fig. SI 
for the three cases Q ^ Nx (red trajectories), (2 = 0 (blue 
trajectories) and (2 = A^x/1.5 (gray trajectories), correspond¬ 
ing to the three cases of dominating Rabi coupling, dominat¬ 
ing nonlinear evolution and the Josephson regime of weak 
Rabi coupling, respectively. For (2/0 the topology of the 
phase space only depends on the parameter A. For A > 1 
and (5 = 0 the phase space (z,(p) features three stable fixed 
points ((0,0) and (±>/l — 1/A^, tt)) and the unstable fixed 
point (0, tt) on the negative x-axis. An eight-shaped separa- 
trix passing through the unstable fixed point divides the phase 
space into three regions of macroscopically different temporal 
behavior [24, 34]. 


Experimental sequence 



FIG. S4: Experimental sequence for state generation and charac¬ 
terization. The first 7r/2 pulse prepares every atom in an equal su¬ 
perposition of the two internal states ja) and \b), eorresponding to the 
initial coherent spin state. The preparation on the unstable fixed point 
is accomplished by a nonadiabatic change of the radio frequency phase 
and attenuation of (2 for the evolution time te (not to scale). A spin 
echo TT-pulse is applied in the middle of the time evolution to suppress 
external detuning fluctuations. After state generation, rotation pulses 
(indicated by their respective angle) are applied for state characteriza¬ 
tion. The arrows in the coordinate system indicate the corresponding 
rotation axes. 


The experiment starts with a EEC of |1, — 1) in each poten¬ 
tial well. After ramping up the magnetic field to 9.12 G, a 
radio frequency rapid adiabatic passage transfers it to |a)^^ = 
|1, +1)^^. In Fig. S4 we show schematically the experimen¬ 
tal sequence beginning from this initial state. A fast 7r/2 Rabi 
pulse and subsequent non-adiabatic change of the driving phase 
by 37r/2 prepares each condensate in the coherent spin state 
(16) — |a))^^/2^/^, which corresponds to an independent su¬ 
perposition of each atom between the states \a) and \b) with the 
mean spin direction pointing towards the unstable fixed point. 
The state preparation pulses are sufficiently fast such that nonlin¬ 
ear effects induced by particle-particle interaction are negligible. 
The regime A > 1 is addressed by attenuating the Rabi coupling 
strength below the nonlinear interaction Nx- In the middle of 
the time evolution, a fast tt spin-echo pulse around the negative 
x-axis is applied in order to further reduce the effect of shot-to- 
shot fluctuations of the detuning S (caused by the finite magnetic 
field stability). 

The Rabi pulses are implemented with microwave and radio 
frequency magnetic fields 200 kHz red-detuned with respect to 
the |1,1) ^ |2, 0) and |2, —1) o |2, 0) transition, respectively. 
The resulting two-photon Rabi frequency for preparation, tt- 
pulse and tomography rotation is ^ 320 Hz (corresponding to 
A « 0.1) and is calibrated by Rabi flopping. The last ^-rotation 
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FIG. S5: Linear SU(2) interferometer and rotations on the Bloch 
sphere. The Mach-Zehnder interferometer (shown as sketch) is the 
optical analog of the atomic Ramsey sequence. Balanced beamsplitters 
(7r/2-pulses) perform rotations Rx by an angle of ttI2 and the relative 
phase shift inside the interferometer performs a rotation by the 
angle 0. Rx,y,z are the corresponding rotation matrices of SO(3). The 
sequence Rx^^‘^RIRZ^‘^ is mathematically equivalent to RI. For a 
characterization of the interferometric sensitivity it is thus sufficient to 
apply Ry only. 

is performed with an independently calibrated Rabi frequency 
of ~ 160 Hz to reduce the influence of spurious timing and 
switching effects. 

Spatial homogeneity 

Special care has to be taken to ensure homogeneity of all 
applied fields along the one-dimensional lattice since we want 
lattice sites separated by up to ^ 100 pm to bear consistent con¬ 
ditions. We compensate the gradient of the magnetic offset field 
by adjusting small permanent magnets near the experimental 
chamber such that a magnetically sensitive microwave Ramsey 
sequence on the transition |1,1) ^ |2,2)no longer shows any 
observable periodic pattern along the lattice. We note, that this 
measurement can be performed with higher accuracy than the 
on-site magnetic field stability because we have the advantage 
of the many-well information in every shot. It is only limited 
by the collisional (mean-field) shift of the two employed lev¬ 
els, which depends on the number of atoms on each lattice site. 
The relevant gradient of the microwave magnetic field can be 
measured by Rabi flopping on the transition |1,1) ^ |2, 0) with 
many cycles (^ 50), which eventually dephases along the lattice. 
This was minimized by small displacements of the microwave 
antenna and we find a remaining power gradient of 3.2 % over 
the whole ensemble which probably stems from the surrounding 
metal parts. For the fast rotation pulses, the influence of this 
remaining gradient on the homogeneity of the Rabi coupling is 


partly compensated by the gradient of the radio frequency power, 
which is on the same order but with opposite sign due to the 
special geometry and position of the coil of the radio frequency 
antenna. The remaining gradient of the Rabi coupling is ^ 0.7% 
over the whole BEC array, by which we can constrain the inho¬ 
mogeneity of a 7r/2-pulse to < 0.3 degrees in the relevant range 
of atom numbers with a maximum spacing of 20 wells. This 
corresponds to ^ 5% of the width (2 standard deviations) of a 
coherent spin state with 400 atoms. 

For the two-photon coupling, the gradients of the off-resonant 
microwave and radio frequency driving also cause inhomogene¬ 
ity of the AC Zeeman shifts. The absolute values are ^ 120 Hz 
(microwave) and ^ 70 Hz (radio frequency) which add up to 
~ 190 Hz total shift of the two-photon resonance in the case 
of the maximal power used for the fastest pulses. In order to 
reduce the influence of these gradients on the detuning during 
the time evolution, we distribute the attenuation over the two 
contributions. The microwave power is reduced by 11 dB with 
an attenuator on a fast MW switch with two ports. The remain¬ 
ing 14 dB attenuation needed to reach A 1.5 is achieved by 
reducing the power of the radio frequency. In this way, the gra¬ 
dient of the detuning d during the time evolution can be reduced 
to below 27r x 0.3 Hz over the whole ensemble. 

Long term stability and systematics 

The most critical parameter in the experimental sequence is 
the detuning 5 during the time evolution. Because the nonlinear 
term is on the order of Nx ^ 27r x 30 Hz, we have to work with 
Rabi frequencies of ^ 27r x 20 Hz to obtain A ^ 1.5, which 
makes the system sensitive to detuning on the level of less than 
1 Hz. Slow magnetic field changes due to small temperature 
drifts of the magnetic field sensor translate into detuning of 
^ lOHz/mG. To obtain consistent conditions, we perform 
automated Ramsey experiments on the two-photon transition 
after 30-40 experimental repetitions and adjust the setpoint of the 
magnetic field stabilization accordingly. Compared to magnetic 
field measurements on a linear Zeeman sensitive transition, this 
has the advantage that small drifts of the AC Zeeman shift 
are also compensated. The additional information from these 
periodic reference measurements is used to filter out repetitions 
where subsequent Ramsey experiments differ by more than 
1.5 Hz. Measurements of the Rabi frequency are performed on 
a daily basis and small drifts < 0.5% are corrected by slight 
adjustments of the radio frequency power. 

Because of the dependence of the AC Zeeman shift on the 
power of the driving fields, special care was taken to ensure the 
resonance condition separately for all rotation pulses on the level 
of ±1% of the respective Rabi frequency. For this, we measure 
small amplitude Josephson oscillations with mean relative phase 
0 of 0 (plasma oscillations) and tt (tt oscillations) and adjust 
the detuning such that they both have the same amplitude and 
offset in the imbalance z. The frequency difference of the same 
measurements is used to estimate the initial nonlinearity [24] . 

Systematic overestimation of the Fisher information from the 
fits to the Hellinger distance could be caused by an underes¬ 
timation of the effective Rabi frequency during the 6>-rotation 
pulses. To minimize effects of switching, we attenuate the radio 
frequency and work with an offset rotation angle of 6 degrees. 
Thus the smallest angle (3.5 degrees, corresponding ioO = —2.5 
degrees) still translates into a pulse duration of ^ 60 ps, which 
corresponds to ^ 360 cycles of the radio frequency. The round- 
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ing of the pulse length to full ps is included in the data analysis. 
From routinely performed measurements of the Rabi frequency, 
we can constrain the systematic error to < 1 % which translates 
into an uncertainty of A 6 > < 0.025 degrees and is negligible 
compared to the error bars of the squared Hellinger distance. 

Experimental probability distributions 

To obtain the experimental probability distributions for the 
Hellinger distance analysis, we collect the measurements of ^ 
in bins of width = 4/A after postselection for the total 
atom number N. This width is slightly smaller than the exper¬ 
imental resolution due to photon shot-noise of ~ ±4 atoms in 
the two population numbers Na and A^, which translates into 
an uncertainty of (A 2 :)psn ~ 6 /A. By varying the bin width, 
we observe the expected saturation of the Hellinger distances 
for bin widths between 2/A and ^ 5/A. The value 4/A is 
chosen to minimize artificial broadening of essential features 
of the distributions while still providing sufficient statistics in 
each bin. To obtain the experimental probabilities, each count is 
divided by the total number of counts in the distribution. For the 
Hellinger distance measurements, we typically collect ^ 2000 
experimental realizations atO = 0 and ^ 500 at 6 > 7 ^ 0. 


text, we average the values for at every tomography angle a 
over all settings of 0. We note that reported results have not been 
corrected for detection or technical noise contributions. Values 
stated in dB are calculated as ~ 10 log]^o(C^Ar))- 


Fisher Information and Cramer-Rao bound 


Let {Pz{0)} = {P{z\0)} be the conditional probability dis¬ 
tribution of the random variable Z, which continuously depends 
on the parameter 0, and 0{Z) an unbiased estimator of 0, i.e. 
(0) = 0, where (•) indicates the expectation value. We start 
from the equalities 




de 

de 


ae 

z 

'£p,{e) = o. 


(55) 

(56) 


With the assumption that the summing range does not depend 
on 6 >, we get 

y^(0- o)^PM = ((©- e)^\ogP,{e)\ = 1. (S7) 

2: ' ' 


Reconstruction of the density matrix 

For quantum state reconstruction, we collect the measure¬ 
ments of z for each tomography angle a in bins of width 
Az = 2 /A according to the granularity of the symmetric 
Hilbert space of J = A/2 with discrete eigenvalues {m} = 
{—A/2,..., A/2} of Jz. We then use the iterative algorithm 
described in [35] to obtain a maximum likelihood estimate of 
the density matrix p. For visualization, the Husimi projection 
(X ('^9, (/)|p|^, 0) on the coherent spin states [36] 


Taking the square of both sides, the Cauchy-Schwarz inequality 
if 9 )^ < {P){9^), where f = {0 - 9) and g = de logP^( 0 ), 
yields 

((0-0)2>^(^^logP,(0)^ ^ >1. (S 8 ) 

We thus obtain the Cramer-Rao bound A0^ > 1/F, where 
((0 — 6 >)^) is the variance A 0 ^ of the estimator 0 and 



1/2 


|J,m) 


(S3) 


with T = exp(—tan('i9/2) is calculated on a grid of 255 x 
255 points for the azimuthal angle (j) and the polar angle 'd, 
respectively. For the density plots in Fig. 2A of the main text, 
the obtained values are divided by their maximum. 


F = ^PM(J^iogPM^ (S9) 

is the Fisher information. The extension to the case of m in¬ 
dependent measurements leads to A0^ > XjmP [17], which 
is a natural extension obeying the extra scaling with 1 /m for 
multiple measurements. 


Squeezing analysis 


The separable coherent spin state (Eq. S3) has the binomial 
variance (A2:^)css = (4/A^)AJ^ = 4p(l — p)/A in where 
p = ((^) + l)/2. This is a consequence of the spin uncertainty 
of A atoms independently prepared in the same superposition 
state. We refer to the value = A 2 (^/(A 2 (^)css as number 
squeezing factor and to the value 


^2 ^ SAT 


A 


4p(l-p)V2 




(S4) 


as spin squeezing factor. V < 1 is the visibility of a perfect 
Ramsey experiment, which is limited by the elongation of the 
quantum state leading to a reduction of the mean spin length 
(J) [ 6 ]. We estimate V from auxiliary interleaved measurements 
with the tomography angle a that results in the biggest variance 
of (longest axis of the state). From these measurements, we 
deduce the normalized mean spin length (cos(7r/2 — ^)) = 
(Vl — — V. For the results shown in Fig. 2C of the main 


Extraction of the Fisher information from the Hellinger distance 

We first illustrate the basic theory by considering the ideal 
situation where the probabilities are known. Later we will con¬ 
sider the experimentally relevant case where relative frequencies 
(experimental probabilities) are acquired. 

Ideal case: probabilities. The squared Hellinger distance be¬ 
tween the probability distributions po = {^^( 0 )} at 6 >o =0 and 
pe = {Pz{0)} at finite 0 is 

dUPo,Pe) = 1-5] pPM Pz{0), (SIO) 

Z 

where the sum, generally referred to as the Bhattacharyya co¬ 
efficient (statistical fidelity or overlap), extends to all values of 
z. For small i.e. closely spaced probability distributions, the 
Taylor expansion of the squared Hellinger distance yields 

di{po,P0) = jO^ + ^o^ + o{e% (sii) 
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where F is the value of the Fisher information (Eq. S9) at ^ = 0 
and F' = dF{0)/dO\e={) its derivative. The zeroth order of 
the Taylor expansion vanishes because Pz{^) = 1 and the 
first order vanishes because ^ Pz{^) = 0. Thus, the Fisher 

information can be extracted from a polynomial fit to d‘^{po,po) 
by extracting the coefficient of the quadratic term. Note, that 
F > 0, while F' and the higher order terms of the expansion 
can also be negative. By shifting the reference frame to Oq ^ 0, 
this procedure can easily be generalized to obtain the Fisher 
information at arbitrary values of 0. 

Experimental case: frequencies. The method illustrated above to 
extract the Fisher information is valid for any state, phase shift 
operation and measurement. However, it requires the knowledge 
of the exact probability distributions. The corresponding experi¬ 
mentally obtainable value is the relative frequency (experimental 
probability) distribution {Fz{d)} for given 0, which approaches 
{Pz{0)} for infinitely many independent measurements. We 
thus consider here the extraction of the Fisher information from 
a natural extension of the Hellinger distance (Eq. SIO), 


4(/o, fe) = l-Y. (S 12 ) 

Z 


where /o = {*^ 2 ( 0 )} and fo = {Fz{d)} dXQ the outcome fre¬ 
quencies obtained from a sample of M experimental realizations. 
Due to statistical fiuctuations, ^^(/o, fe) varies when repeating 
the measurement. We write 

Fz{0) = Pz{0)FSFz{0)^ (S13) 

where {Pz{0)} is the true underlying probability distribution 
(from which {Fz{d)} is sampled) and 6Fz{d) are the multino¬ 
mial sampling errors characterized by 

{SFzm = 0 

(S14) 

Co-v[5F,{0),5:F,,{e)] = forz^z' 

Because of the normalization, frequency fluctuations sum to 
zero: SFz{d) = 0. We now expand Eq. S12 in the Taylor 

series around ^0 = 0 and SFz{d) Pz{0). Since {6Fz{d)) = 
0 , this is justified if y/{6Fz{d)‘^) ^ Pz{d) which, according 
to Eq. S14, corresponds to the condition Pz{0) > 1 /(M + 1 ). 
This is satisfied for M sufficiently large. Taking the sample 
average, we find 

+ C 2 ^ + (S15) 

with 


fo)) ~ ^0 + ( 


Co 


C2 


n — 1 


4M 

1 

32M 


F + Y,{de log P,{0)f 

Z 


(S16) 


where n is the number of discrete values of z for which 
Pz{0) 7 ^ 0. Since F is the expectation value of {do log Pz{0))‘^ 
at ^ = 0, we can estimate C 2 ~ F{1 + n)/(32M). We em¬ 
phasize that the first order term in the ^-expansion of Eq. S12 
vanishes. Notably, this happens also in the presence of frequency 


fluctuations. The estimation of the Hellinger distance with ex¬ 
perimental probability distributions is asymptotically unbiased. 
The bias decreases with 1/M and is due to the finite fluctuations 
on the estimated probabilities and their strict positiveness. It can 
be reduced by employing a resampling procedure (see below). 
Regarding the statistical error of the Hellinger distance, we find 

{AdUfoJe))" = Fo^ + O{0^5F^). (S17) 


Resampling procedure 

Both Co and C 2 in Eq. S15 scale as 1/M, which is a prereq¬ 
uisite of the Jackknife procedure for bias reduction [37]. We 
divide the M experimental realizations in g blocks of h samples 
(M = hg). Eor the calculation of the Jackknife estimates we 
use (< 7 ^) 2 , that is the squared Hellinger distance evaluated with 
the ith group of experimental results of block size h removed 
{i = 1,..., ^). The Jackknife estimator 

(<^h)j = ~ (S18) 

d i=i 

has the property to eliminate the 1/M-term from the estimate 
(4>- Eurthermore, the variance of {{d^)i} is used to esti¬ 
mate the statistical uncertainty of The results of this block- 
Jackknife are averaged up to blocksize h = 20 for mean and 
variance. We verified the validity of this approach with Monte- 
Carlo simulations including realistic probability distributions 
and typical experimental sample sizes. 


Bayesian estimation 


Let Oq be the true value of the phase shift and {zi}m = 
{zi ,..., 2 ;^} a sequence of m independent measurement results, 
obtained with probability P{;,.}^(6>o) = Pzi{0o)- The 
Bayes theorem 


PoiiZijm) 


P{z.U{0)Pi0) 

P{{Zi}m) 


(S19) 


assigns a probability distribution Pe{{zi}m) to the variable 
0, conditioned by the measurement sequence { 2 :^}^. Here 
P{0) represents the prior knowledge about the phase shift 
0 which is assumed to be phase-independent in our analy¬ 
sis, and P{{zi}m) provides the normalization of P 9 {{zi}m)- 
The shape of Pe{{zi}m) depends on the specific probabili¬ 
ties (0) and on the observed results. Nevertheless, in the 
limit m 00 , Po{{zi}m) becomes normally distributed, cen¬ 
tered around the true value of the phase shift and with vari¬ 
ance = 1/mP. This result is generally referred to as the 
Laplace-Bernstein-von Mises theorem [38]. To demonstrate 
it, we rewrite Pe{{zi}m) in terms of frequencies Fz{0q) to ob¬ 
serve the result Pe{{Zi}m) oc (^ 0 )^ where the 

proportionality sign indicates equality up to a normalization 
constant. In the limit m ^ 00 , frequencies tend to probabilities 
(Pz(do) Pz(do)) and thus 


^ogPe{{z^}m) log Pe{z) + const., (S20) 

z 


where the constant term contains the normalization of 
Pe{{zi}m)- We now expand this equation in the Taylor se¬ 
ries around the maximum of P^d^^}^). Eirst, the condition 
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de log Pe{{zi}m) = 0 is satisfied if 0 = Oq. We will assume 
that Oq is the only maximum of Eq. S20. Up to the leading order 
in m and neglecting constant terms, which are absorbed in the 
normalization of we thus have 

\ogPe{{zi}rr,)^-'^:^^^{e-0of. (S21) 

The fact that P > 0 guarantees that higher order terms of the 
expansion are negligible for m ^ oo. 

For the Bayesian analysis presented in the main text we take 
advantage of the fact that the experimental probability distribu¬ 
tion used as P^ (0) (reference) for the calculation of the Hellinger 
distance was collected with approximately four times the experi¬ 
mental repetitions compared to Pz{0 0). We randomly draw 
1000 repetitions Zi of this setting and use the remaining ones to 
generate the distribution Pz{0). We then determine the region 
a < z <b, where Pz{0j) > 0 for all Oj. The 1000 repetitions 
are used as independent realizations for Bayesian estimation of 


0 and grouped in m-sequences {zi}m- For each m-sequence 
we calculate the likelihood C{0j) = Y\T=i (equivalent 

to Pej{{zi}m) without normalization and P{0) = 1) for the 
(discrete) values of Oj for which experimental probabilities are 
available. Realizations with Pz.{0j) = 0 for some Oj, i.e. out¬ 
side the range [a, b] are discarded. This is an essential step to 
obtain a finite value of the likelihood for each sequence. Note, 
that we do not reduce the value of m in this step. Discarding too 
many realizations would thus show up as reduced sensitivity in 
the further analysis. 

We use no prior knowledge on the phase shift, i.e. P{0) = 
1. As discussed above, we expect P 9 j{{zi}m) and thus the 
likelihood jC{0j) to be normally distributed around the true value 
of the phase shift 0 = 0. We thus fit the results for log jC{0j) 
quadratically and extract a from the curvature of the fit. As 
discussed in the main text, for sufficiently large m (see Fig. 4), 
we get ^ 1/mP, where F agrees with the value of the 
Fisher information obtained with the Hellinger distance method. 



